187 research outputs found

    A Bootstrap Lasso + Partial Ridge Method to Construct Confidence Intervals for Parameters in High-dimensional Sparse Linear Models

    Constructing confidence intervals for the coefficients of high-dimensional sparse linear models remains a challenge, mainly because of the complicated limiting distributions of the widely used estimators, such as the lasso. Several methods have been developed for constructing such intervals. Bootstrap lasso+ols is notable for its technical simplicity, good interpretability, and performance that is comparable with that of other more complicated methods. However, bootstrap lasso+ols depends on the beta-min assumption, a theoretic criterion that is often violated in practice. Thus, we introduce a new method, called bootstrap lasso+partial ridge, to relax this assumption. Lasso+partial ridge is a two-stage estimator. First, the lasso is used to select features. Then, the partial ridge is used to refit the coefficients. Simulation results show that bootstrap lasso+partial ridge outperforms bootstrap lasso+ols when there exist small, but nonzero coefficients, a common situation that violates the beta-min assumption. For such coefficients, the confidence intervals constructed using bootstrap lasso+partial ridge have, on average, 50% larger coverage probabilities than those of bootstrap lasso+ols. Bootstrap lasso+partial ridge also has, on average, 35% shorter confidence interval lengths than those of the de-sparsified lasso methods, regardless of whether the linear models are misspecified. Additionally, we provide theoretical guarantees for bootstrap lasso+partial ridge under appropriate conditions, and implement it in the R package "HDCI".

    Pair-switching rerandomization

    Rerandomization discards assignments with covariates unbalanced in the treatment and control groups to improve the estimation and inference efficiency. However, the acceptance-rejection sampling method used by rerandomization is computationally inefficient. As a result, it is time-consuming for classical rerandomization to draw numerous independent assignments, which are necessary for constructing Fisher randomization tests. To address this problem, we propose a pair-switching rerandomization method to draw balanced assignments much more efficiently. We show that the difference-in-means estimator is unbiased for the average treatment effect and the Fisher randomization tests are valid under pair-switching rerandomization. In addition, our method is applicable in both non-sequentially and sequentially randomized experiments. We conduct comprehensive simulation studies to compare the finite-sample performances of the proposed method and classical rerandomization. Simulation results indicate that pair-switching rerandomization leads to comparable power of Fisher randomization tests and is 4-18 times faster than classical rerandomization. Finally, we apply the pair-switching rerandomization method to analyze two clinical trial data sets, both demonstrating the advantages of our method.
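
    To make the acceptance-rejection step of classical rerandomization concrete, here is a minimal Python sketch. It is not the paper's pair-switching method. The balance criterion (largest absolute standardized difference in covariate means), the function names, and the threshold are all assumptions chosen for illustration; the abstract does not specify them.

```python
import random
from statistics import mean, pstdev


def imbalance(x, assignment):
    """Largest absolute standardized mean difference across covariates.

    Hypothetical balance criterion chosen for this sketch only.
    x is a list of rows (one list of covariates per unit);
    assignment is a list of 0/1 treatment indicators.
    """
    worst = 0.0
    for j in range(len(x[0])):
        col = [row[j] for row in x]
        treated = [v for v, a in zip(col, assignment) if a == 1]
        control = [v for v, a in zip(col, assignment) if a == 0]
        sd = pstdev(col)
        if sd > 0:
            worst = max(worst, abs(mean(treated) - mean(control)) / sd)
    return worst


def rerandomize(x, n_treated, threshold, rng, max_draws=100_000):
    """Draw complete randomizations until one is balanced enough.

    Returns the accepted assignment and the number of draws it took.
    """
    base = [1] * n_treated + [0] * (len(x) - n_treated)
    for draws in range(1, max_draws + 1):
        assignment = base[:]
        rng.shuffle(assignment)          # one complete randomization
        if imbalance(x, assignment) <= threshold:
            return assignment, draws     # accept
        # otherwise reject and draw again
    raise RuntimeError("no acceptable assignment found within max_draws")


if __name__ == "__main__":
    rng = random.Random(0)
    covariates = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(20)]
    assignment, draws = rerandomize(covariates, 10, 0.2, rng)
    print(assignment, draws)
```

    Every rejected draw is wasted work, and a tighter threshold means more rejections. This is the inefficiency the abstract describes.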

    Regression-adjusted average treatment effect estimates in stratified randomized experiments

    Researchers often use linear regression to analyse randomized experiments to improve treatment effect estimation by adjusting for imbalances of covariates in the treatment and control groups. Our work offers a randomization-based inference framework for regression adjustment in stratified randomized experiments. Under mild conditions, we re-establish the finite population central limit theorem for a stratified experiment. We prove that both the stratified difference-in-means and the regression-adjusted average treatment effect estimators are consistent and asymptotically normal. The asymptotic variance of the latter is no greater and is typically smaller than that of the former. We also provide conservative variance estimators to construct large-sample confidence intervals for the average treatment effect.
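
    As a point of reference for the unadjusted estimator named above, the following Python sketch computes a stratified difference-in-means. It assumes the standard form, in which each stratum's difference in means is weighted by that stratum's share of the units. The function name and data layout are hypothetical, and the regression-adjusted estimator from the abstract is not shown.

```python
from collections import defaultdict
from statistics import mean


def stratified_diff_in_means(stratum, z, y):
    """Weighted average of within-stratum differences in means.

    stratum: stratum label per unit; z: 0/1 assignment; y: outcome.
    Weights are stratum sizes divided by the total sample size
    (an assumption of this sketch).
    """
    arms = defaultdict(lambda: {0: [], 1: []})
    for s, zi, yi in zip(stratum, z, y):
        arms[s][zi].append(yi)

    n = len(y)
    estimate = 0.0
    for groups in arms.values():
        n_s = len(groups[0]) + len(groups[1])
        estimate += (n_s / n) * (mean(groups[1]) - mean(groups[0]))
    return estimate


if __name__ == "__main__":
    stratum = ["a"] * 4 + ["b"] * 6
    z = [1, 1, 0, 0, 1, 1, 1, 0, 0, 0]
    y = [3, 5, 1, 3, 10, 12, 14, 8, 9, 10]
    # Stratum a: 4 - 2 = 2 with weight 0.4; stratum b: 12 - 9 = 3 with weight 0.6.
    print(stratified_diff_in_means(stratum, z, y))
```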

    Model-assisted complier average treatment effect estimates in randomized experiments with non-compliance and a binary outcome

    In randomized experiments, the actual treatments received by some experimental units may differ from their treatment assignments. This non-compliance issue often occurs in clinical trials, social experiments, and the applications of randomized experiments in many other fields. Under certain assumptions, the average treatment effect for the compliers is identifiable and equal to the ratio of the intention-to-treat effects of the potential outcomes to that of the potential treatment received. To improve the estimation efficiency, we propose three model-assisted estimators for the complier average treatment effect in randomized experiments with a binary outcome. We study their asymptotic properties, compare their efficiencies with that of the Wald estimator, and propose the Neyman-type conservative variance estimators to facilitate valid inferences. Moreover, we extend our methods and theory to estimate the multiplicative complier average treatment effect. Our analysis is randomization-based, allowing the working models to be misspecified. Finally, we conduct simulation studies to illustrate the advantages of the model-assisted methods and apply these analysis methods in a randomized experiment to evaluate the effect of academic services or incentives on academic performance.
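
    The ratio described above is what the Wald estimator computes from sample means. A minimal Python sketch follows; the function name and the toy data are hypothetical, and the paper's three model-assisted estimators are not reproduced here.

```python
from statistics import mean


def wald_cace(z, d, y):
    """Wald estimate of the complier average treatment effect.

    z: 0/1 treatment assignment, d: 0/1 treatment actually received,
    y: outcome. Returns ITT effect on y divided by ITT effect on d.
    """
    y1 = [yi for zi, yi in zip(z, y) if zi == 1]
    y0 = [yi for zi, yi in zip(z, y) if zi == 0]
    d1 = [di for zi, di in zip(z, d) if zi == 1]
    d0 = [di for zi, di in zip(z, d) if zi == 0]

    itt_y = mean(y1) - mean(y0)   # effect of assignment on the outcome
    itt_d = mean(d1) - mean(d0)   # effect of assignment on treatment received
    if itt_d == 0:
        raise ValueError("assignment has no effect on treatment received")
    return itt_y / itt_d


if __name__ == "__main__":
    z = [1, 1, 1, 1, 0, 0, 0, 0]
    d = [1, 1, 1, 0, 0, 0, 0, 0]   # one assigned unit does not comply
    y = [1, 1, 0, 0, 1, 0, 0, 0]   # binary outcome
    # ITT on y = 0.5 - 0.25 = 0.25; ITT on d = 0.75 - 0 = 0.75
    print(wald_cace(z, d, y))
```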

    Quantifying jet transport properties via large $p_T$ hadron production

    Nuclear modification factor $R_{AA}$ for large $p_T$ single hadron is studied in a next-to-leading order (NLO) perturbative QCD (pQCD) parton model with medium-modified fragmentation functions (mFFs) due to jet quenching in high-energy heavy-ion collisions. The energy loss of the hard partons in the QGP is incorporated in the mFFs which utilize two most important parameters to characterize the transport properties of the hard parton jets: the jet transport parameter $\hat q_{0}$ and the mean free path $\lambda_{0}$, both at the initial time $\tau_0$. A phenomenological study of the experimental data for $R_{AA}(p_{T})$ is performed to constrain the two parameters with simultaneous $\chi^2/{\rm d.o.f}$ fits to RHIC as well as LHC data. We obtain for energetic quarks $\hat q_{0}\approx 1.1 \pm 0.2$ GeV$^2$/fm and $\lambda_{0}\approx 0.4 \pm 0.03$ fm in central $Au+Au$ collisions at $\sqrt{s_{NN}}=200$ GeV, while $\hat q_{0}\approx 1.7 \pm 0.3$ GeV$^2$/fm, and $\lambda_{0}\approx 0.5 \pm 0.05$ fm in central $Pb+Pb$ collisions at $\sqrt{s_{NN}}=2.76$ TeV. Numerical analysis shows that the best fit favors a multiple scattering picture for the energetic jets propagating through the bulk medium, with a moderate averaged number of gluon emissions. Based on the best constraints for $\lambda_{0}$ and $\tau_0$, the estimated value for the mean-squared transverse momentum broadening is moderate which implies that the hard jets go through the medium with small reflection. Comment: 8 pages, 6 figures, revised version

    Lasso adjustments of treatment effect estimates in randomized experiments

    We provide a principled way for investigators to analyze randomized experiments when the number of covariates is large. Investigators often use linear multivariate regression to analyze randomized experiments instead of simply reporting the difference of means between treatment and control groups. Their aim is to reduce the variance of the estimated treatment effect by adjusting for covariates. If there are a large number of covariates relative to the number of observations, regression may perform poorly because of overfitting. In such cases, the Lasso may be helpful. We study the resulting Lasso-based treatment effect estimator under the Neyman-Rubin model of randomized experiments. We present theoretical conditions that guarantee that the estimator is more efficient than the simple difference-of-means estimator, and we provide a conservative estimator of the asymptotic variance, which can yield tighter confidence intervals than the difference-of-means estimator. Simulation and data examples show that Lasso-based adjustment can be advantageous even when the number of covariates is less than the number of observations. Specifically, a variant using Lasso for selection and OLS for estimation performs particularly well, and it chooses a smoothing parameter based on combined performance of Lasso and OLS.